Supervised Coupled Dictionary Learning with Group Structures for Multi-modal Retrieval
Abstract
A better similarity mapping function across heterogeneous high-dimensional features is highly desirable for many applications involving multi-modal data. In this paper, we introduce coupled dictionary learning (DL) into supervised sparse coding for multi-modal (cross-media) retrieval. We call this approach Supervised coupled dictionary learning with group structures for Multi-modal retrieval (SliM). SliM formulates the multi-modal mapping as a constrained dictionary learning problem. By exploiting the intrinsic ability of DL to handle heterogeneous features, SliM extends unimodal DL to multi-modal DL. Moreover, SliM uses label information to discover the structure shared within each modality by samples of the same class, via a mixed norm (i.e., the ℓ1/ℓ2-norm). As a result, multi-modal retrieval is conducted via a set of jointly learned mapping functions across the multi-modal data. The experimental results show the effectiveness of the proposed model when applied to cross-media retrieval.
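To make the mixed norm mentioned in the abstract concrete, the following is a minimal sketch of a class-grouped ℓ1/ℓ2 norm and its proximal operator (group soft-thresholding), written with NumPy. It is not the authors' implementation: the layout of the coefficient matrix (dictionary atoms in rows, samples in columns), the grouping of columns by class label, and the function names `mixed_norm` and `group_soft_threshold` are all assumptions made for illustration.

```python
import numpy as np

def mixed_norm(A, labels):
    """l1/l2 norm: for each class, sum the l2 norms of every row of A
    restricted to the columns (samples) of that class.
    Assumed layout: A has shape (n_atoms, n_samples)."""
    total = 0.0
    for c in np.unique(labels):
        block = A[:, labels == c]                    # atoms x samples of class c
        total += np.linalg.norm(block, axis=1).sum() # l2 per row, l1 across rows
    return total

def group_soft_threshold(A, labels, lam):
    """Proximal operator of lam * mixed_norm: shrink each (atom, class)
    row segment toward zero, zeroing segments whose l2 norm is below lam."""
    out = np.zeros_like(A, dtype=float)
    for c in np.unique(labels):
        mask = labels == c
        block = A[:, mask].astype(float)
        norms = np.linalg.norm(block, axis=1, keepdims=True)
        # Guard against division by zero for all-zero segments.
        scale = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
        out[:, mask] = block * scale
    return out
```

Because the penalty acts on whole row segments, an atom is either used by all samples of a class or dropped for that class, which is one common way a mixed norm encourages same-class samples to share dictionary atoms.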
Similar resources
Uncorrelated Multi-View Discrimination Dictionary Learning for Recognition
Dictionary learning (DL) has become an important feature learning technique that achieves state-of-the-art recognition performance. Due to the sparse characteristics of data in real-world applications, DL uses a set of learned dictionary bases to represent the linear decomposition of a data point. Fisher discrimination DL (FDDL) is a representative supervised DL method, which constructs a structured...
Self-Supervised Adversarial Hashing Networks for Cross-Modal Retrieval
Thanks to the success of deep learning, cross-modal retrieval has made significant progress recently. However, there still remains a crucial bottleneck: how to bridge the modality gap to further enhance the retrieval accuracy. In this paper, we propose a self-supervised adversarial hashing (SSAH) approach, which lies among the early attempts to incorporate adversarial learning into cross-modal ...
A New Method for Speech Enhancement Based on Incoherent Model Learning in Wavelet Transform Domain
The quality of a speech signal degrades significantly in the presence of environmental noise, leading to imperfect performance of hearing aid devices, automatic speech recognition systems, and mobile phones. In this paper, single-channel speech enhancement of signals corrupted by additive noise is considered. A dictionary-based algorithm is proposed to train the speech...
UMass CIIR at TAC KBP 2013 Entity Linking: Query Expansion using Urban Dictionary
This paper describes the system submitted to the TAC 2013 entity linking task of the Knowledge Base Population track. The core of the approach is probabilistic information retrieval over a search index of the knowledge base, including the text of Wikipedia. The retrieval results are further reranked using a supervised learning-to-rank model. The submission this year builds on the neighborhood a...
Supervised learning of bag-of-features shape descriptors using sparse coding
We present a method for supervised learning of shape descriptors for shape retrieval applications. Many content-based shape retrieval approaches follow the bag-of-features (BoF) paradigm commonly used in text and image retrieval by first computing local shape descriptors, and then representing them in a 'geometric dictionary' using vector quantization. A major drawback of such approaches is that...